CPM: A Large-scale Generative Chinese Pre-trained Language Model

Authors

Abstract

Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as its training corpus is primarily English and its parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB of Chinese training data, is the largest Chinese pre-trained language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in few-shot (even zero-shot) settings. The code is available at https://github.com/TsinghuaAI/CPM.
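The abstract emphasizes few-shot (even zero-shot) use of CPM as a generative model: downstream tasks are posed as text continuation rather than fine-tuning. The sketch below illustrates what such in-context prompting might look like with a Hugging Face-compatible causal language model; it is a minimal sketch under stated assumptions, and the checkpoint name used here is hypothetical, not taken from the report.

```python
# Minimal sketch of few-shot prompting with a generative Chinese LM.
# Assumption: a Hugging Face-compatible CPM checkpoint exists; the model id
# below is a placeholder, not confirmed by the technical report.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TsinghuaAI/CPM-Generate"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Few-shot prompt: two in-context examples followed by an unfinished query;
# the model is expected to continue the pattern without any fine-tuning.
prompt = (
    "问题：中国的首都是哪里？ 答案：北京。\n"
    "问题：美国的首都是哪里？ 答案：华盛顿。\n"
    "问题：法国的首都是哪里？ 答案："
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern covers the tasks listed in the abstract (conversation, essay generation, cloze test): only the prompt changes, not the model.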


Similar articles

A Practical Desalinization Model for Large Scale Application

Salinity of soil and water is the most important agricultural hazard in arid and semi-arid regions. In saline soils, yield production is directly influenced by soluble salts in the root zone as well as by shallow water table depth. The first step for reclamation of such soils is reducing salinity to an optimum level by leaching. The objective of this study was to develop a practical model to estimate wat...

A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets

Background and Purpose: Nowadays, breast cancer is reported as one of the most common cancers amongst women. Early detection of the cancer type is essential to aid in informing subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large datasets and are not developed for small datasets. Although the large datasets might lead ...

Recognition performance of a large-scale dependency grammar language model

In this paper, we describe a large-scale investigation of dependency grammar language models. Our work includes several significant departures from earlier studies, notably a larger training corpus, improved model structure, different feature types, new feature selection methods, and more coherent training and test data. We report word error rate (WER) results of a speech recognition experiment,...

Syntax-Based Word Ordering Incorporating a Large-Scale Language Model

A fundamental problem in text generation is word ordering. Word ordering is a computationally difficult problem, which can be constrained to some extent for particular applications, for example by using synchronous grammars for statistical machine translation. There have been some recent attempts at the unconstrained problem of generating a sentence from a multi-set of input words (Wan et al., ...

Extending Generative Models of Large Scale Networks

Since the launch of Facebook in 2004 and Twitter in 2006, the amount of publicly available social network data has grown in both scale and complexity. This growth presents significant challenges to conventional network analysis methods that rely primarily on structure. In this paper, we describe a generative model that extends structure-based connection preference methods to include preferences...

Journal

Journal title: AI Open

Year: 2021

ISSN: 2666-6510

DOI: https://doi.org/10.1016/j.aiopen.2021.07.001